Report on the CLEF-IP 2012 Experiments: Search of Topically Organized Patents

Authors

  • Michail Salampasis
  • Georgios Paltoglou
  • Anastasia Giahanou
Abstract

This technical report presents our work applying Distributed Information Retrieval (DIR) methods to federated search of patent documents for the task of passage retrieval starting from claims (patentability or novelty search). Patent documents produced worldwide carry manually assigned classification codes, which we use to cluster, distribute and index patents across hundreds or thousands of sub-collections. We tested different combinations of source selection algorithms (CORI, BordaFuse, Reciprocal Rank) and results merging algorithms (SSL, CORI). We also varied the number of collections requested and the number of documents retrieved from each collection. One aim of the experiments was to test how older DIR methods, which characterize collections using collection statistics such as term frequencies, perform in patent search. A second aim was to experiment with newer DIR methods, which focus on explicitly estimating the number of relevant documents in each collection and usually improve precision over previous approaches, although their recall is usually lower. The most important aim, however, was to examine how DIR methods perform when patents are topically organized using their IPC codes, and whether DIR methods can approximate the performance of a centralized index approach. We submitted 8 runs. According to PRES@100, our best DIR approach ranked 7th among 31 submitted results; however, our best DIR run (not submitted) outperforms all submitted runs.
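To make the source selection step more concrete, the following is a minimal Python sketch of one common reading of Reciprocal Rank source selection over IPC-based sub-collections. It is not the authors' implementation. It assumes that a query is first run against a small centralized sample index, and that each sub-collection is then scored by summing 1/rank over its sampled documents. The function names, the choice of the four-character IPC subclass as the sub-collection key, and the input structures are all hypothetical.

```python
from collections import defaultdict


def ipc_subcollection(ipc_code, level=4):
    """Map an IPC code such as 'H04L 29/06' to a sub-collection key.

    level=4 keeps the IPC subclass ('H04L'); this cut-off is an
    assumption made for illustration only.
    """
    return ipc_code.replace(" ", "")[:level]


def reciprocal_rank_selection(sample_ranking, doc_to_ipc, n_collections=3):
    """Score sub-collections from a ranked list of sampled documents.

    sample_ranking: document ids, best first, from a sample index.
    doc_to_ipc: dict mapping a document id to its list of IPC codes.
    Returns the top n_collections (key, score) pairs, best first.
    """
    scores = defaultdict(float)
    for rank, doc_id in enumerate(sample_ranking, start=1):
        # A patent may carry several IPC codes; count each
        # sub-collection once per document.
        keys = {ipc_subcollection(code) for code in doc_to_ipc.get(doc_id, [])}
        for key in keys:
            scores[key] += 1.0 / rank
    # Sort by descending score, breaking ties by key for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n_collections]
```

The selected sub-collections would then be queried individually and their result lists combined by a results merging algorithm such as SSL or CORI, which this sketch does not cover.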


Similar resources

Report on the CLEF-IP 2013 Experiments: Multilayer Collection Selection on Topically Organized Patents

This technical report presents the work which has been carried out using Distributed Information Retrieval methods for federated search of patent documents for the passage retrieval starting from claims (patentability or novelty search) task. Patent documents produced worldwide have manually-assigned classification codes which in our work are used to cluster, distribute and index patents throug...


CLEF-IP 2011: Retrieval in the Intellectual Property Domain

The patent system is designed to encourage disclosure of new technologies and novel ideas by granting exclusive rights on the use of inventions to their inventors, for a limited period of time. Before a patent can be granted, patent offices around the world perform thorough searches to ensure that no previous similar disclosures were made. In the intellectual property terminology, such kind of se...


CLEF-IP 2012: Retrieval Experiments in the Intellectual Property Domain

The Clef-Ip test collection was first made available in 2009 to support research in IR methods in the intellectual property domain. Since then several kinds of tasks, reflecting various specific parts of patent expert's workflows, have been organized. We give here an overview of the tasks, topics, assessments and evaluations of the Clef-Ip 2012 lab.


CLEF-IP 2009: Retrieval Experiments in the Intellectual Property Domain

The Clef Ip track ran for the first time within Clef 2009. The purpose of the track was twofold: to encourage and facilitate research in the area of patent retrieval by providing a large clean data set for experimentation; to create a large test collection of patents in the three main European languages for the evaluation of cross lingual information access. The track focused on the task of prior...


Passage Retrieval Starting from Patent Claims: A Clef-Ip 2013 Task Overview

Most of the searches a patent expert at a patent office does are using boolean methods to query large databases of patent data. The Clef-Ip evaluation track is designed to experiment with information retrieval techniques on the patent domain. The data corpus in the Clef-Ip Lab consists of patent documents published by the European Patent Office. One of the main tasks in the Lab has been related to ...



Publication date: 2012